ABSTRACT


In this paper, we propose a framework for the design of linear
decentralized estimation schemes based on a team-theoretic approach.
We view local estimates as "decisions" which affect the information
received by other decision makers. Using results from team theory,
we provide necessary conditions for optimality of the estimates. For
fully decentralized structures, these conditions provide a complete
closed-form solution of the estimation problem. The complexity of
the resulting estimation algorithms is studied as a function of
the performance measure, and in the context of some simple examples.
1.  INTRODUCTION


A standard problem in estimation theory consists of using a set of
available information about a random variable to obtain an estimate of
its value. When the criterion used in evaluating the estimate is the
conditional variance of the estimation error, the best estimator is given
by the conditional mean. However, this formulation assumes that all
of the available information is concentrated at a central location.

In  many  areas  of  application,  such  as  Command  and  Control  systems  and 
meteorology,  the  acquisition  of  data  is  characterized  by  sensors 
which  are  spatially  and  temporally  distributed.  Thus,  there  are 
nontrivial  costs  associated  with  the  transfer  of  data  to  a  central 
location  for  the  purpose  of  estimation. 

An approach to designing estimation algorithms for these areas
of application is to preprocess some of the data at various local
processing nodes, thereby reducing the communication load on the
system. The result is an estimation scheme with a fixed structure
(often hierarchical), and constraints on the available information
at any one node. Figure 1 depicts a typical estimator structure.


Figure  1 



The structure of Figure 1 has similarities with a decentralized
decision problem. In this paper, we propose to study estimation
problems with fixed estimator structures, hereafter referred to as
distributed estimation problems, by imbedding the estimation problem in a class
of decentralized decision problems. These decision problems have
special structures which can be exploited for some linear Gaussian
systems to obtain closed-form solutions for the estimators. In
particular, the decision variables do not affect the evolution of the
state variables and, in certain cases, they do not affect the
observations received by other decision makers. This latter case results
in a partially nested decision problem, as defined in Ho and Chu [1].

There has been a significant amount of recent work on the subject
of distributed estimation. The various approaches can be divided into
two classes. The first class consists of methods which use the
distributed structure of the problem in such a way as to achieve an overall
estimator whose error corresponds to that of a fully centralized
estimator, and thus optimality is achieved. Elegant solutions to some of
these problems are presented in [2], [3], and [4]. The second class of
approaches consists of utilizing a fixed structure, which is simple,
to achieve the best performance possible with this restricted structure.
This approach can seldom achieve the performance of a centralized scheme.
Typical of the results in this case are the papers of Tacker, Sanders and
their colleagues [5], [6].

In this paper, we follow the spirit of the second approach.
Specifically, we take as given a specific architecture of processing stations,
with prespecified flows of information among them. Given this structure,
and the a priori statistics of the random variables present in the system,
we restrict the data processing to consist of linear strategies of the
available data. It is our purpose to characterize the "best" processing
schemes in terms of an overall performance measure; our estimation problem
will thus become a stochastic team problem, where a number of decision
agents with different information seek to minimize a common cost.

Fixed structure decentralized decision problems have been considered
by a number of authors [7], [8], and [9]. Our approach in this paper
follows very closely the formulation of Barta [9] for linear control of
decentralized stochastic systems. Indeed, most of the results of Section
4 of this paper appear in Barta and Sandell [10].

The paper is organized as follows. Section 2 contains the
mathematical formulation of fixed structure linear estimation problems using
a decision theoretic viewpoint. Section 3 presents general necessary
conditions which optimal estimators must satisfy. These conditions are
of limited practical use in general because of their complexity. In Section 4, we specialize
the results of Section 3 to a specific structure which corresponds to a
fully decentralized estimation algorithm. This case permits significant
analysis, as was previously done in Barta and Sandell [10]. We extend
their results to illustrate how the complexity of the local estimation
algorithm depends on the importance of correlation between the errors of
the various local estimators. Section 5 contains some simple examples
which illustrate the results of Section 4. Section 6 discusses the results
and areas of future research.

2.  MATHEMATICAL  FORMULATION 

Assume that there are N local substations and one coordinator station
in the decentralized estimation system. Denote the state of the
environment by x(t), an R^n-valued random process on [0,T] whose evolution is
governed by the stochastic differential equation

dx(t) = A(t)x(t)dt + B(t)dw(t),  (2.1)

where w(t) is an R^m-valued standard Wiener process. Each local substation
receives data from local measurements, described by the observation equations

dy_i(t) = C_i(t)x(t)dt + D_i(t)dv_i(t)  (2.2)

where v_i(t), w(t) are standard, mutually independent Wiener processes, and
y_i(t) is an R^{m_i}-valued random process. The matrices A(t), B(t), C_i(t), D_i(t)
are assumed continuous on [0,T] for i = 0,...,N. In addition, the matrices
D_i(t) are assumed invertible for all i, t.
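
As an illustration of the model (2.1)-(2.2), the following minimal sketch simulates a scalar version by Euler-Maruyama discretization. The constant scalar coefficients, the fixed initial state `x0` (the paper takes x(0) random), the step count and the function name `simulate` are all assumptions made for the example; they are not part of the formulation above.

```python
import math
import random

def simulate(a, b, c, d, x0, T=1.0, steps=1000, seed=0):
    """Euler-Maruyama discretization of scalar versions of (2.1)-(2.2).

    a, b : constant scalar stand-ins for A(t), B(t)
    c, d : lists of constant scalar stand-ins for C_i(t), D_i(t)
    Returns the state path and one cumulative observation path per station.
    """
    rng = random.Random(seed)
    dt = T / steps
    sq = math.sqrt(dt)
    x = [x0]
    y = [[0.0] for _ in c]
    for _ in range(steps):
        xk = x[-1]
        dw = rng.gauss(0.0, sq)              # Wiener increment driving the state
        for i, (ci, di) in enumerate(zip(c, d)):
            dv = rng.gauss(0.0, sq)          # independent increment for station i
            y[i].append(y[i][-1] + ci * xk * dt + di * dv)
        x.append(xk + a * xk * dt + b * dw)
    return x, y

# Two hypothetical local stations observing the same scalar state.
x_path, y_paths = simulate(a=-1.0, b=1.0, c=[1.0, 0.5], d=[1.0, 1.0], x0=0.0)
```

Each station sees only its own path `y_paths[i]`, which is the information constraint that the rest of the formulation builds on.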

To each local substation corresponds a decision agent, whose decisions
are denoted by u_i(t) in R^{p_i}. The decisions made at each substation depend
only on real-time observations of local data, as in equation (2.2), plus
the a priori knowledge about the statistics of the system. The a priori
knowledge, common to all local substations and the coordinator station,
consists of knowledge of the matrices A(t), B(t), C_i(t), D_i(t), for i = 0,...,N,
t in [0,T], together with the distribution of the initial condition
x(0). For the sake of simplicity, we assume that x(0) is a zero-mean, normal
random variable with covariance \Sigma(0).

The coordinator station receives the decision outputs of all the local
subsystems, u_i(t), i = 1,...,N, in addition to an independent set of
measurements y_0(t). The output of the coordinator station is denoted by u_0(t), and
it is based on real-time observation of measurements and the prior decisions
of the local substations.


Associated with the estimation structure is a performance index, of the
form

J = E { \int_0^T (u(t) - S(t)x(t))^T Q(t) (u(t) - S(t)x(t)) dt },  (2.3)

where u(t) consists of the vector of decisions

u^T(t) = (u_0^T(t), ..., u_N^T(t)),  (2.4)

and the superscript T denotes transposition. The matrix Q(t) is assumed
positive semidefinite and continuous for t in [0,T]. With this performance
criterion, the design of a distributed estimation scheme can be reduced to
determining the admissible decision strategies which minimize the quadratic
function J.



The admissible strategies are restricted to be linear maps of the
available information which yield mean-square integrable decision variables.
Specifically, since equation (2.2) implies that the local observations are
corrupted by additive white noise, we assume that, for i = 1,...,N,

u_i(t) = \int_0^T H_i(t,s) dy_i(s)  (2.5)

where

H_i(t,s) = 0 if s > t,  (2.6)

and

Trace \int_0^T \int_0^T H_i(t,s) H_i^T(t,s) dt ds < \infty.  (2.7)

For the coordinator, we assume that

u_0(t) = \int_0^T H_0(t,s) dy_0(s) + \sum_{i=1}^N \int_0^t K_i(t,s) u_i(s) ds + \sum_{i=1}^N L_i(t) u_i(t)  (2.8)

where H_0, K_i satisfy (2.6) and (2.7), while the matrices L_i(t) are
continuous on [0,T].
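
To make the causality constraint (2.6) concrete, the sketch below applies a discretized version of the local strategy (2.5) to a list of observation increments. The time grid, the exponential-forgetting kernel and the name `apply_causal_kernel` are hypothetical choices for illustration only.

```python
import math

def apply_causal_kernel(H, dy):
    """Discretized u(t_k) = sum over s_j <= t_k of H(t_k, s_j) * dy_j.

    H  : function (k, j) -> kernel value for output step k and input step j
    dy : list of observation increments dy_j
    Causality (2.6) is enforced by summing only over j <= k.
    """
    return [sum(H(k, j) * dy[j] for j in range(k + 1)) for k in range(len(dy))]

# Hypothetical kernel: exponential forgetting with rate lam on a grid of width dt.
dt, lam = 0.1, 2.0

def kernel(k, j):
    return math.exp(-lam * (k - j) * dt)

increments = [1.0, 0.0, 0.0, 0.5]
u = apply_causal_kernel(kernel, increments)
```

With a kernel that is identically one, the strategy reduces to a running sum of the increments, which is a quick way to sanity-check the indexing.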


The parametrization of the control laws in equations (2.5) to (2.8)
results in admissible strategy spaces which are Hilbert spaces. Specifically,
the admissible strategies for u_i, i = 1,...,N, are elements of the Hilbert space
of linear operators from L_2([0,T], R^{m_i}) to L_2([0,T], R^{p_i}) with finite trace,
and inner product

<H_i^1, H_i^2> = Trace \int_0^T \int_0^T H_i^1(t,s) (H_i^2(t,s))^T dt ds = Trace (H_i^1 H_i^{2*}).  (2.9)

For additional information about Hilbert spaces of operators, the
reader should consult Balakrishnan [11]. We will use the symbol H_i
without its arguments to refer to the linear operator, while H_i(t,s) will be
used to refer to the kernel of the operator.

The assumption of linear strategies for all decision agents in the
problem represents a restriction on the class of admissible strategies.
However, the system and observations described by equations (2.1) and (2.2)
result in zero-mean, jointly Gaussian random processes x, y_0, ..., y_N. Since
the decisions u(t) do not affect the evolution of the state x(t) (this is
a property of estimation problems), for any control law u(t) such that

E \int_0^T ||u(t)||^2 dt < \infty,  (2.10)

we can use a version of Fubini's theorem to show

J = \int_0^T E { (u(t) - S(t)x(t))^T Q(t) (u(t) - S(t)x(t)) } dt.  (2.11)

Notice that the optimal estimator will minimize the integrand

J_t = E { (u(t) - S(t)x(t))^T Q(t) (u(t) - S(t)x(t)) }  (2.12)

almost everywhere. In many cases, this will enable us to show that the
true optimal solution belongs to the admissible class of linear strategies.

To conclude this section, we will discuss some relevant examples, and indicate
how they fit in this framework.


Example 1. Centralized estimation

Assume that N = 0, so that the only station present is the coordinator
station. In this case, J_t corresponds to

J_t = E { (u_0(t) - S(t)x(t))^T Q(t) (u_0(t) - S(t)x(t)) }

Its minimum among all mean-square integrable u_0 is achieved at

u_0(t) = S(t) \hat{x}(t)  (2.13)

where \hat{x}(t) is the minimum variance estimate of x(t), given the prior
observations, which is obtained from a Kalman filter. Hence, the optimal
estimator is linear.
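
For a scalar, constant-coefficient version of (2.1)-(2.2), the Kalman filter mentioned in Example 1 can be sketched with a plain Euler discretization of the Kalman-Bucy equations. The coefficient values, the step size and the function name `kalman_bucy` are assumptions for illustration; this is not code from the paper.

```python
import math

def kalman_bucy(dy, a, b, c, d, p0, dt):
    """Euler discretization of a scalar Kalman-Bucy filter.

    Model (scalar, constant coefficients, assumed for illustration):
        dx = a x dt + b dw,    dy = c x dt + d dv
    dy is the list of observation increments; returns (estimates, final variance).
    """
    x_hat, p = 0.0, p0
    estimates = []
    for inc in dy:
        gain = c * p / (d * d)                            # Kalman gain K = P c / d^2
        x_hat += a * x_hat * dt + gain * (inc - c * x_hat * dt)
        p += dt * (2 * a * p + b * b - (c * p / d) ** 2)  # Riccati equation
        estimates.append(x_hat)
    return estimates, p

# With zero increments the estimate stays at the prior mean 0, while the error
# variance settles at the positive root of 2 a P + b^2 - (c P / d)^2 = 0.
est, p_final = kalman_bucy([0.0] * 10000, a=-1.0, b=1.0, c=1.0, d=1.0, p0=1.0, dt=0.001)
p_steady = math.sqrt(2.0) - 1.0
```

The coordinator's decision in (2.13) would then be S(t) times the running estimate, which is a linear function of the observations.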

Example 2. Hierarchical Estimation

Let N = 2. Furthermore, let p_1 = p_2 = n and

S(t) = \begin{bmatrix} I \\ I \\ I \end{bmatrix},  Q(t) = \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & I \end{bmatrix}

Assume C_0(t) = 0 for all t.

Then, equation (2.12) yields

J_t = E { \sum_{i=0}^2 (u_i(t) - x(t))^T (u_i(t) - x(t)) }

We consider the minimization of J_t over all mean-square integrable decisions.
The last two terms in the sum are minimized by using local Kalman filters at
each local substation. Furthermore, it was established in Willsky, Castanon et
al [2], that the first term can be minimized absolutely, when the local
strategies are Kalman filters, by a strategy of the form (2.8). Hence, the optimal
hierarchical estimator for this problem is in the class of linear estimators.


Example 3. Fully Decentralized Estimation

Assume that there is no coordinator station, so that u_0(t) = 0 for all t.
In this case,

J_t = E { (u(t) - S(t)x(t))^T Q(t) (u(t) - S(t)x(t)) }

For each t, this is a static team problem with jointly Gaussian statistics;
hence, Radner's theorem [12] implies that the optimal decision strategies are
linear maps of the available observations, and hence they belong to the linear
class in equations (2.5) to (2.8).

Example 4. Let N = 1, p_1 = 1, p_0 = n, and let S(t) and Q(t) be such that
only the error of the coordinator's decision is penalized. Then,

J_t = E { (u_0(t) - x(t))^T (u_0(t) - x(t)) }

It is clear that, if n > 1, some form of nonlinear encoding of the
information will provide a lower value of J_t than the best linear encoder,
because u_1 is a scalar signal and x is a vector process. In this case, the
optimal decision rules are nonlinear.

In  many  cases,  the  optimal  estimation  strategies  will  be  nonlinear. 
Nevertheless,  there  will  be  a  person-by-person-optimal  linear  strategy 
which  will  be  of  interest  because  of  ease  of  implementation.  In  the  next 
Section,  we  provide  necessary  conditions  which  characterize  these  linear 
person-by-person  optimal  strategies. 

3.  NECESSARY CONDITIONS

The formulation of Section 2 imbedded the distributed estimation
problem into a team decision problem with a quadratic criterion, where decision
rules are elements of a Hilbert space of linear operators. In this section,
we provide necessary conditions which characterize the estimators resulting
from this approach. The mathematical development of this section follows
closely the development in Barta [9].


In operator notation, equations (2.5) and (2.8) can be written as

u_i = H_i(dy_i),  i = 1,...,N  (3.1)

u_0 = H_0 dy_0 + \sum_{i=1}^N (K_i u_i + L_i u_i)  (3.2)

where L_i is the linear operator with kernel

L_i(t,s) = L_i(t) \delta(t-s)  (3.3)

Furthermore, the quadratic functional (2.3) can be written as

J = E { \int_0^T (u(t) - S(t)x(t))^T Q(t) (u(t) - S(t)x(t)) dt }

  = Trace { S^* Q S \Sigma_{xx} + Q \Sigma_{uu} - 2 Q \Sigma_{ux} S^* }  (3.4)

where \Sigma_{xx}, \Sigma_{uu}, and \Sigma_{ux} are the covariance operators [11] corresponding to the
random processes x(t) and u(t). Note that the decision operators are
implicit in defining u(t) as a random process.
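
A finite-dimensional analogue of the trace form (3.4) can be checked numerically. In the sketch below, matrices stand in for the operators, empirical second moments stand in for the covariance operators, and the sample values are made up; only the algebraic identity is being illustrated.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def second_moment(a_samples, b_samples):
    """Empirical E[a b^T] from paired samples (lists of vectors)."""
    n = len(a_samples)
    return [[sum(a[i] * b[j] for a, b in zip(a_samples, b_samples)) / n
             for j in range(len(b_samples[0]))]
            for i in range(len(a_samples[0]))]

S = [[1.0, 1.0], [0.0, 1.0]]
Q = [[1.0, 0.5], [0.5, 1.0]]
xs = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0], [2.0, -1.0]]   # made-up state samples
us = [[0.5, 1.0], [1.0, -1.0], [0.0, 0.5], [2.0, 0.0]]    # made-up decisions

def quad_cost(u, x):
    # (u - S x)^T Q (u - S x) for one sample
    e = [ui - sum(s * xj for s, xj in zip(row, x)) for ui, row in zip(u, S)]
    return sum(e[i] * Q[i][j] * e[j] for i in range(2) for j in range(2))

direct = sum(quad_cost(u, x) for u, x in zip(us, xs)) / len(xs)

Sxx, Suu, Sux = second_moment(xs, xs), second_moment(us, us), second_moment(us, xs)
St = transpose(S)
formula = (trace(matmul(matmul(matmul(St, Q), S), Sxx))
           + trace(matmul(Q, Suu))
           - 2.0 * trace(matmul(matmul(Q, Sux), St)))
```

The two numbers `direct` and `formula` agree up to rounding, because the identity holds for any second moments as long as Q is symmetric.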

Let us partition u as

u(t) = [u_0^T(t), u_1^T(t), ..., u_N^T(t)]^T = [u_0^T(t), \bar{u}^T(t)]^T  (3.5)

Then, \Sigma_{uu} can be partitioned as

\Sigma_{uu} = \begin{bmatrix} \Sigma_{u_0 u_0} & \Sigma_{u_0 \bar{u}} \\ \Sigma_{u_0 \bar{u}}^* & \Sigma_{\bar{u}\bar{u}} \end{bmatrix}  (3.6)

Furthermore, \bar{u}(t) is related to \bar{y}(t), the vector of local observations
y_1(t), ..., y_N(t), by

\bar{u} = diag{H_i} d\bar{y}  (3.7)

so that

\Sigma_{\bar{u}\bar{u}} = (diag H_i) \Sigma_{d\bar{y} d\bar{y}} (diag H_i)^*  (3.8)

Similarly,

\Sigma_{u_0 \bar{u}} = [ H_0 \Sigma_{dy_0 dy_1} H_1^*  ...  H_0 \Sigma_{dy_0 dy_N} H_N^* ]
    + \sum_{i=1}^N [ (K_i+L_i) H_i \Sigma_{dy_i dy_1} H_1^*  ...  (K_i+L_i) H_i \Sigma_{dy_i dy_N} H_N^* ]  (3.9)

and

\Sigma_{u_0 u_0} = H_0 \Sigma_{dy_0 dy_0} H_0^*
    + \sum_{i=1}^N { H_0 \Sigma_{dy_0 dy_i} H_i^* (K_i^* + L_i^*) + (K_i+L_i) H_i \Sigma_{dy_0 dy_i}^* H_0^* }
    + \sum_{i=1}^N \sum_{j=1}^N (K_i+L_i) H_i \Sigma_{dy_i dy_j} H_j^* (K_j+L_j)^*  (3.10)


A similar partition yields

\Sigma_{ux} = \begin{bmatrix} \Sigma_{u_0 x} \\ \Sigma_{\bar{u} x} \end{bmatrix}  (3.11)

where

\Sigma_{u_0 x} = H_0 \Sigma_{dy_0 x} + \sum_{i=1}^N (K_i+L_i) H_i \Sigma_{dy_i x}  (3.12)

\Sigma_{\bar{u} x} = \begin{bmatrix} H_1 \Sigma_{dy_1 x} \\ \vdots \\ H_N \Sigma_{dy_N x} \end{bmatrix}  (3.13)

Using equations (3.6) - (3.13) in equation (3.4), we can express the functional
J as a deterministic quadratic function of the operators H_i, L_i, K_i, which are
elements of a linear Hilbert space. We will denote this dependence by

J = J(H, L, K)  (3.14)

Since J is a quadratic functional, and the linear operators H, L, K are
elements of Hilbert spaces, we can compute the Frechet differential of J
[13] with respect to variations in the operators. In particular, we will
denote by \delta_{H_0} J, \delta_{K_i+L_i} J and \delta_{H_i} J the Frechet differentials of J in the
direction of each of the components of H, K and L. Partition the operators
Q, S, according to equation (3.6), as

Q = \begin{bmatrix} Q_{00} & Q_{01} \\ Q_{10} & Q_{11} \end{bmatrix}  (3.14)

S = \begin{bmatrix} S_0 \\ S_1 \end{bmatrix}  (3.15)

Then, we can use equations (3.6) - (3.15) to obtain the Frechet differentials

\delta_{H_0} J(H,K,L; \tilde{H}_0) = 2 Trace { [ Q_{00} H_0 \Sigma_{dy_0 dy_0}
    + \sum_{i=1}^N Q_{00} (K_i+L_i) H_i \Sigma_{dy_i dy_0}
    + Q_{01} \begin{bmatrix} H_1 \Sigma_{dy_1 dy_0} \\ \vdots \\ H_N \Sigma_{dy_N dy_0} \end{bmatrix}
    - Q_{00} S_0 \Sigma_{dy_0 x}^* - Q_{01} S_1 \Sigma_{dy_0 x}^* ] \tilde{H}_0^* }  (3.16)

\delta_{K_i+L_i} J(H,K,L; \tilde{K}_i+\tilde{L}_i) = 2 Trace { [ Q_{00} H_0 \Sigma_{dy_0 dy_i} H_i^*
    + \sum_{j=1}^N Q_{00} (K_j+L_j) H_j \Sigma_{dy_j dy_i} H_i^*
    + Q_{01} \begin{bmatrix} H_1 \Sigma_{dy_1 dy_i} \\ \vdots \\ H_N \Sigma_{dy_N dy_i} \end{bmatrix} H_i^*
    - Q_{00} S_0 \Sigma_{dy_i x}^* H_i^* - Q_{01} S_1 \Sigma_{dy_i x}^* H_i^* ] (\tilde{K}_i+\tilde{L}_i)^* }  (3.17)

\delta_{H_i} J(H,K,L; \tilde{H}_i) = 2 Trace { [ (K_i^* + L_i^*) Q_{00} H_0 \Sigma_{dy_0 dy_i}
    + \sum_{j=1}^N (K_i+L_i)^* Q_{00} (K_j+L_j) H_j \Sigma_{dy_j dy_i}
    + (K_i+L_i)^* Q_{01} \begin{bmatrix} H_1 \Sigma_{dy_1 dy_i} \\ \vdots \\ H_N \Sigma_{dy_N dy_i} \end{bmatrix}
    + Q_{10}^i H_0 \Sigma_{dy_0 dy_i}
    + \sum_{j=1}^N Q_{10}^i (K_j+L_j) H_j \Sigma_{dy_j dy_i}
    + \sum_{j=1}^N Q_{11}^{ij} H_j \Sigma_{dy_j dy_i}
    - (K_i+L_i)^* (Q_{00} S_0 + Q_{01} S_1) \Sigma_{dy_i x}^*
    - (Q_{10}^i S_0 + Q_{11}^i S_1) \Sigma_{dy_i x}^* ] \tilde{H}_i^* }  (3.18)

where Q_{10}^i, Q_{11}^i and Q_{11}^{ij} are the blocks of Q_{10} and Q_{11} in the partition corresponding
to the partition of \bar{u}(t) = (u_1^T(t), ..., u_N^T(t))^T.

N 

Using expressions (3.16) - (3.18), we can provide necessary conditions for
optimality of a set of linear maps (H,K,L), as follows:

Proposition 3.1  If H, K, L minimize the functional J over the space of
all linear maps, then

(a) \delta_{H_0} J(H,K,L; \tilde{H}_0) = 0

(b) \delta_{K_i+L_i} J(H,K,L; \tilde{K}_i+\tilde{L}_i) = 0

(c) \delta_{H_i} J(H,K,L; \tilde{H}_i) = 0

for all i = 1,...,N, and for all admissible \tilde{K}_i, \tilde{L}_i, \tilde{H}_i and \tilde{H}_0.

Proof  The proof follows directly from Theorem 1 in Chapter 7 in Luenberger [13],
since the existence of Frechet differentials provides an expression for the
Gateaux differential, which must be zero at a minimum.

Proposition 3.1 can be used, together with the fact that admissible operators
H, K, L are Volterra integral operators, to obtain sets of coupled integral
conditions which characterize the optimal solution, in a manner similar to
Wiener-Hopf factorization [14]. We will not do so here, focusing instead
on obtaining the expressions which characterize the optimum in the specific
case of equations (2.1) - (2.2) for the fully decentralized case in the next
section.


4.  FULLY DECENTRALIZED ESTIMATION

In the fully decentralized case, the coordinator station is absent.
In terms of the formulation of Section 3, the operators K_i, L_i and H_0 are
identically zero, as are the weighting matrices S_0, Q_{00}, Q_{01} and Q_{10}, for all
time t in [0,T]. This causes an extensive simplification in the equations
of Proposition 3.1. Specifically, equation (3.18) now becomes

\delta_{H_i} J(H; \tilde{H}_i) = 2 Trace { [ \sum_{j=1}^N Q_{11}^{ij} H_j \Sigma_{dy_j dy_i} - (Q_{11}^i S_1) \Sigma_{dy_i x}^* ] \tilde{H}_i^* }  (4.1)

The equivalent set of integral equations corresponding to equation (4.1)
is

\sum_{j=1}^N Q_{11}^{ij} \int_0^t H_j(t,s_1) \Sigma_{dy_j dy_i}(s_1,s) ds_1 - (Q_{11}^i S_1(t)) \Sigma_{x dy_i}(t,s) = 0  (4.2)

where \Sigma_{x dy_i} = \Sigma_{dy_i x}^*. A similar equation can be found in Barta-Sandell [10], where a solution is
found using an innovations approach. We will present a different derivation
of their results in this section.

Assume that the weighting matrix Q_{11}, henceforth denoted Q, satisfies Q > 0
and is constant in time. This implies that the
cost functional J is strictly convex, so that there is a unique minimum,
which is characterized by the integral equation (4.2). Furthermore,
assume, without loss of generality, that all decisions u_i are scalar-valued,
that is p_i = 1 for all i. A vector-valued decision can be decomposed into
p_i stations with the same information. Hence, the assumption in equation
(2.2) that the v_i are mutually independent Wiener processes will be
removed at this stage, to allow for this development. In the same spirit,
write S for S_1, which is now an N x n matrix.

We begin by noting that equation (4.2) is a linear equation driven
by a sum of terms in the right hand side. Hence, by superposition, the
optimal solution H_j(t,s) can be written as

H_j(t,s) = \sum_{\ell=1}^N \sum_{k=1}^n G_j^{\ell k}(t,s) S_{\ell k}(t)  (4.3)

where G_j^{\ell k}(t,s) minimizes J when S = \delta_{\ell k}, that is, S has a one in the \ell k-th
entry and zero elsewhere. Hence, G_j^{\ell k}(t,s) solves

\sum_{j=1}^N Q_{ij} \int_0^t G_j^{\ell k}(t,s_1) \Sigma_{dy_j dy_i}(s_1,s) ds_1 = Q_{i\ell} \Sigma_{x_k dy_i}(t,s),  i = 1,...,N  (4.4)

Notice that the form of Q determines the form of the linear system on the
left side. It is possible to solve for all G^{\ell k} simultaneously, because of
the consistency of the problems (4.4). Let J^{\ell k} denote the cost function J
when S = \delta_{\ell k}. Then,

(G_1^{\ell k}, ..., G_N^{\ell k}) = argmin_{G^{\ell k}} J^{\ell k}(G^{\ell k})  (4.5)

Define a global cost J^T, given by

J^T = \sum_{\ell=1}^N \sum_{k=1}^n J^{\ell k}(G^{\ell k})  (4.6)

The cost J^T is separable in its arguments. Hence, minimization of J^T
corresponds to solving equation (4.5) for each \ell, k.

Let us examine closely the nature of the costs J^{\ell k}. From equation
(2.3), J^{\ell k} corresponds to

J^{\ell k} = E { \int_0^T (u(t) - \delta_\ell x_k(t))^T Q (u(t) - \delta_\ell x_k(t)) dt }  (4.7)

where \delta_\ell is a vector with all zeroes except a one in the \ell'th entry.
Furthermore, minimization of J^{\ell k} is accomplished by minimizing

J_t^{\ell k} = E { (u(t) - \delta_\ell x_k(t))^T Q (u(t) - \delta_\ell x_k(t)) }  (4.8)

for each t. Let d^\ell(t) correspond to the n x N matrix

d^\ell(t) = [ u_i^{\ell k}(t) ],  k = 1,...,n (rows),  i = 1,...,N (columns)  (4.9)

representing the decision variables associated with problems J^{\ell k}, k = 1,...,n
in (4.6), where u_i^{\ell k}(t) is the decision of station i in problem J^{\ell k}. Let

D(t) = \begin{bmatrix} d^1(t) \\ \vdots \\ d^N(t) \end{bmatrix},  X(t) = \begin{bmatrix} x(t) & & 0 \\ & \ddots & \\ 0 & & x(t) \end{bmatrix}

be nN x N matrices. Then, a simple calculation establishes that

J_t^T = Trace E { (D(t) - X(t)) Q (D(t) - X(t))^T }  (4.10)

where the i-th column of D(t) is a linear function of the local observation
process y_i(t) only.

This is the same formulation used in Barta-Sandell [10]. We will
state their main result without proof, as it applies to systems of the form
(2.1) - (2.3). Before we can do so, we must introduce some notation.

The state process of equation (2.1) is given by

dx(t) = A(t)x(t)dt + B(t)dw(t)  (4.11)

with local observations

dy_i(t) = C_i(t)x(t)dt + D_i(t)dv_i(t)  (4.12)

where v_i(t), w(t) are standard Brownian motions with w(t) independent of
all v_i(s).

Let

\mathcal{A}(t) = diag { A(t), ..., A(t) }

\mathcal{B}(t)dw(t) = diag { B(t)dw(t), ..., B(t)dw(t) }

\mathcal{C}(t) = diag { C_1(t), ..., C_N(t) }  (4.13)

then, we have

dX(t) = \mathcal{A}(t)X(t)dt + \mathcal{B}(t)dw(t)  (4.14)

Define also

\Sigma_{ww}(t) = \begin{bmatrix} Q_{11}I & \cdots & Q_{1N}I \\ \vdots & & \vdots \\ Q_{N1}I & \cdots & Q_{NN}I \end{bmatrix} diag [ B(t)B^T(t), ..., B(t)B^T(t) ]  (4.15)

as the enlarged system relevant driving noise intensity.
Similarly, define

\Sigma_{vv}(t) = \begin{bmatrix} Q_{11} D_1(t)D_1^T(t) & \cdots & Q_{1N} E{ D_1(t)dv_1(t)dv_N^T(t)D_N^T(t) } \\ \vdots & & \vdots \\ Q_{N1} E{ D_N(t)dv_N(t)dv_1^T(t)D_1^T(t) } & \cdots & Q_{NN} D_N(t)D_N^T(t) \end{bmatrix}  (4.16)

as the enlarged system relevant observation noise intensity. With this
notation, the main result of [10] is:

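Proposition 4.2  The Decentralized Kalman Filter

The optimal team decision rule for equation (4.10), \hat{X}(t), with i-th column
\hat{X}^i(t), satisfies

d\hat{X}^i(t) = \mathcal{A}(t)\hat{X}^i(t)dt + K(t) [ I_i dy_i(t) - \mathcal{C}(t)\hat{X}^i(t)dt ]  (4.17)

where

K(t) = \Sigma(t) \mathcal{C}^T(t) \Sigma_{vv}^{-1}(t),

I_i = [0^T, ..., I^T, ..., 0^T]^T is a (\sum_j m_j) x m_i dimensioned matrix with
the identity in its ith block, and \Sigma(t) solves the Riccati equation

\dot{\Sigma}(t) = \mathcal{A}(t)\Sigma(t) + \Sigma(t)\mathcal{A}^T(t) - K(t)\Sigma_{vv}(t)K^T(t) + \Sigma_{ww}(t)  (4.16)

\Sigma(0) = \begin{bmatrix} Q_{11}I & \cdots & Q_{1N}I \\ \vdots & & \vdots \\ Q_{N1}I & \cdots & Q_{NN}I \end{bmatrix} diag [ \Sigma_0, ..., \Sigma_0 ],

with \Sigma_0 the covariance of x(0) introduced in Section 2.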


The estimator of Proposition 4.2 is depicted in Figure 2. The striking
feature of this estimator is that each local agent uses identical
estimation systems, of dimension NnxN, differing only in the input used to
drive the systems. However, in many applications, these estimators are
much larger than are necessary. In particular, it is important to note
that it is the presence of Q which creates nontrivial couplings in the
team problem, leading to large-dimension estimators.

When Q is diagonal, the expressions for \Sigma_{ww}(t) and \Sigma_{vv}(t) are block-diagonal.
In this case, it can be established that \Sigma(t), as given by equation (4.16),
will also be block-diagonal, and the optimal estimator will decompose into
blocks of much smaller dimension. We formalize this in the following
proposition.

Proposition 4.3  Assume Q is diagonal. Then, the optimal decision rule which
minimizes (4.10) can be synthesized using n-dimensional estimators at each
local station.

The proof follows directly from equations (4.15) and (4.16). In the
next section, we will study some specific examples to illustrate the
complexity of the algorithm of Proposition 4.2, and the relation of the
off-diagonal elements of the matrix Q with this complexity.
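
The decoupling effect of a diagonal Q can already be seen in a static analogue, which is not the dynamic statement of Proposition 4.3 but shares its mechanism. The sketch below forms the stationarity conditions for linear strategies u_i = g_i y_i and shows that, for a diagonal Q, they contain no cross terms even when the observations are correlated. The weights in `Q_diag`, the statistics and the function name `normal_equations` are hypothetical.

```python
from fractions import Fraction as F

def normal_equations(Q, S, Eyy, Exy):
    """Matrix A and vector b of the stationarity conditions for u_i = g_i y_i:
    sum_j Q[i][j] E[y_j y_i] g_j = sum_l sum_k Q[i][l] S[l][k] E[x_k y_i]."""
    N, n = len(Q), len(Exy)
    A = [[F(Q[i][j]) * Eyy[j][i] for j in range(N)] for i in range(N)]
    b = [sum(F(Q[i][l]) * S[l][k] * Exy[k][i] for l in range(N) for k in range(n))
         for i in range(N)]
    return A, b

Q_diag = [[2, 0], [0, 5]]            # hypothetical diagonal weighting
S = [[1, 1], [0, 1]]
Eyy = [[2, 1], [1, 2]]               # correlated observations, E[y_i y_j]
Exy = [[1, 1], [1, 1]]               # E[x_k y_i]
A, b = normal_equations(Q_diag, S, Eyy, Exy)

# No off-diagonal coupling: each station can compute its gain on its own.
coupled = any(A[i][j] != 0 for i in range(2) for j in range(2) if i != j)
team_gains = [b[i] / A[i][i] for i in range(2)]
# Gains of the purely local estimates E{(S x)_i | y_i}.
local_gains = [F(sum(S[i][k] * Exy[k][i] for k in range(2)), Eyy[i][i]) for i in range(2)]
```

Here `team_gains` and `local_gains` coincide, and the particular diagonal weights cancel out of the result.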

5.  EXAMPLES

In this section, we discuss some examples of fully decentralized
estimation problems, indicating their relation with the results of Section 4. To
facilitate the understanding of the examples, we will discuss only
non-dynamic Gaussian systems.

Example 1. Let x_1, x_2 be independent, zero-mean Gaussian random variables
with unit variance. Define the two observation equations

y_1 = x_1 + v_1  (5.1)

y_2 = x_2 + v_2  (5.2)

where v_1, v_2, x_1, x_2 are mutually independent, normal, zero-mean random
variables with unit variance.

Assume that there are two local substations. Each substation i has
access to its own measurement y_i. The performance of the estimates is to
be evaluated as
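
J = E { ( \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} - \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} )^T \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix} ( \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} - \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} ) }

  = E { (u_1 - x_1 - x_2)^2 + (u_1 - x_1 - x_2)(u_2 - x_2) + (u_2 - x_2)^2 }  (5.3)

Conditioning on y_1 inside the expectation of equation (5.3), and differentiating
with respect to u_1 yields

2u_1 - 2E{x_1|y_1} = 0

Similarly, conditioning on y_2 and differentiating with respect to u_2 yields

2u_2 - 2E{x_2|y_2} - E{x_2|y_2} = 0

Hence,

u_1 = E{x_1|y_1}
u_2 = (3/2) E{x_2|y_2}  (5.4)

In this example, S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. If S = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
it is clear that

u_1 = E{x_1|y_1}
u_2 = E{x_2|y_2}  (5.5)

is the optimal decentralized estimator. Now, let S = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.
Then,

J = E { (u_1 - x_2)^2 + (u_1 - x_2)u_2 + u_2^2 }

Conditioning with respect to y_1 and differentiating with respect to u_1 yields

2u_1 = 0

Repeating for y_2 and u_2 yields

2u_2 - E{x_2|y_2} = 0

Hence, for S = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, the optimal strategy is

u_1 = 0
u_2 = (1/2) E{x_2|y_2}.

As indicated in Section 4, the solution for S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} is the superposition
of the solutions for S = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} and S = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.

The presence of the off-diagonal elements of Q = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix} is important in
creating the nature of the solution. Notice that, in spite of the
independence of x_1, y_1 and x_2, y_2, the optimal estimator for S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} is
not

\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} E{x_1|y_1} \\ E{x_2|y_2} \end{bmatrix}.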

Example 2

Assume, in Example 1, that x_1 = x_2. Repeating the same logic for obtaining
the optimal solution for S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, we obtain the sufficient conditions

2u_1 - 5E{x_1|y_1} + E{u_2|y_1} = 0
2u_2 - 4E{x_2|y_2} + E{u_1|y_2} = 0  (5.5)

The coupled equations (5.5) can be solved by noting that u_1 = a y_1, u_2 = b y_2,
for some constants a, b. Equation (5.5) becomes

2u_1 - 5E{x_1|y_1} + bE{y_2|y_1} = 0
2u_2 - 4E{x_2|y_2} + aE{y_1|y_2} = 0  (5.6)

Now,

E{y_2|y_1} = E{x_1|y_1}
E{y_1|y_2} = E{x_2|y_2}

So,

a y_1 = (5/2 - b/2) E{x_1|y_1}
b y_2 = (2 - a/2) E{x_2|y_2}

Since E{x_i|y_i} = y_i/2, rewriting in terms of constants,

a + b/4 = 5/4
b + a/4 = 1  (5.7)

so

a = 16/15
b = 11/15  (5.8)

Equation (5.8) was obtained by solving the simultaneous equations obtained
from the variational arguments. For differential systems, these equations
will be coupled integral equations which are hard to solve.
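
As a quick check of the arithmetic, the linear system (5.7) can be solved exactly by Cramer's rule. The helper name `solve_2x2` is hypothetical.

```python
from fractions import Fraction as F

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Cramer's rule for a11*a + a12*b = b1, a21*a + a22*b = b2."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

# (5.7):  a + b/4 = 5/4,   a/4 + b = 1
a, b = solve_2x2(F(1), F(1, 4), F(1, 4), F(1), F(5, 4), F(1))
```

The result is a = 16/15 and b = 11/15, in agreement with (5.8).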

Let us establish the solution (5.8) using the decomposition approach of
Section 4. Let S = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. Then, the performance measure is

J = E { (u_1 - x_1)^2 + (u_1 - x_1)u_2 + u_2^2 }

Variational arguments yield

2u_1 - 2E{x_1|y_1} + E{u_2|y_1} = 0
2u_2 - E{x_2|y_2} + E{u_1|y_2} = 0  (5.9)

which imply a = 7/15, b = 2/15.

By symmetry, the solution for S = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} is

a = 2/15,  b = 7/15

For S = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, the performance measure is

J = E { (u_1 - x_2)^2 + (u_1 - x_2)u_2 + u_2^2 }
  = E { (u_1 - x_1)^2 + (u_1 - x_1)u_2 + u_2^2 }

which has already been solved, yielding

a = 7/15,  b = 2/15

Summing all three yields the result for S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} as

a = 7/15 + 2/15 + 7/15 = 16/15
b = 2/15 + 7/15 + 2/15 = 11/15
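
The superposition used above can also be checked numerically with a person-by-person iteration: each station in turn picks the gain that is optimal given the other's current gain. This Gauss-Seidel style sweep is a swapped-in numerical device for illustration, not the paper's decomposition argument; the function name `best_response_gains` and the sweep count are assumptions.

```python
Q = [[1.0, 0.5], [0.5, 1.0]]
EYY = [[2.0, 1.0], [1.0, 2.0]]   # E[y_i y_j] for y_i = x + v_i, unit variances
EXY = 1.0                        # E[x y_i]

def best_response_gains(S, sweeps=60):
    """Gauss-Seidel sweeps over the two stationarity conditions for u_i = g_i y_i."""
    row_sums = [sum(row) for row in S]          # x_1 = x_2 = x, so S x = row_sums * x
    rhs = [EXY * sum(Q[i][l] * row_sums[l] for l in range(2)) for i in range(2)]
    g = [0.0, 0.0]
    for _ in range(sweeps):
        for i in range(2):
            o = 1 - i                           # index of the other station
            g[i] = (rhs[i] - Q[i][o] * EYY[o][i] * g[o]) / (Q[i][i] * EYY[i][i])
    return g

# Solve the three elementary problems, then compare their sum with the direct solution.
parts = [best_response_gains(S) for S in
         ([[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [0, 1]])]
summed = [sum(p[i] for p in parts) for i in range(2)]
direct = best_response_gains([[1, 1], [0, 1]])
```

Both `summed` and `direct` converge to the gains (16/15, 11/15); the iteration contracts quickly here because the coupling terms are small relative to the diagonal ones.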

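We will now use Proposition 4.2 directly to solve Example 2. Since x_1 = x_2 = x,
the effective state dimension is 1. Hence, the matrix D in Section 4 has
dimension 2x2, with the first column a function of y_1, while the second
column is a function of y_2. The overall team cost is given as in (4.10), by

J = Trace E { ( D - \begin{bmatrix} x & 0 \\ 0 & x \end{bmatrix} ) Q ( D - \begin{bmatrix} x & 0 \\ 0 & x \end{bmatrix} )^T }

The optimal solution \hat{X} is characterized by

E { ( \hat{X} - \begin{bmatrix} x & 0 \\ 0 & x \end{bmatrix} ) Q D^T } = 0  (5.10)

for any D whose first column is a function of y_1, and its second column a
function of y_2. Let

\hat{X} = \begin{bmatrix} a_1 y_1 & b_1 y_2 \\ a_2 y_1 & b_2 y_2 \end{bmatrix}  (5.11)

Equations (5.10) and (5.11) imply

E { \begin{bmatrix} a_1 y_1 - x & b_1 y_2 \\ a_2 y_1 & b_2 y_2 - x \end{bmatrix} Q \begin{bmatrix} y_1 & 0 \\ 0 & y_2 \end{bmatrix} } = 0  (5.12)

which reduces to

\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} E { \begin{bmatrix} y_1 & 0 \\ 0 & y_2 \end{bmatrix} Q \begin{bmatrix} y_1 & 0 \\ 0 & y_2 \end{bmatrix} } = E { \begin{bmatrix} x & 0 \\ 0 & x \end{bmatrix} Q \begin{bmatrix} y_1 & 0 \\ 0 & y_2 \end{bmatrix} }  (5.13)

Let us compute the terms in equation (5.13).

E { \begin{bmatrix} y_1 & 0 \\ 0 & y_2 \end{bmatrix} \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix} \begin{bmatrix} y_1 & 0 \\ 0 & y_2 \end{bmatrix} } = \begin{bmatrix} 2 & 1/2 \\ 1/2 & 2 \end{bmatrix}

E { \begin{bmatrix} x & 0 \\ 0 & x \end{bmatrix} \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix} \begin{bmatrix} y_1 & 0 \\ 0 & y_2 \end{bmatrix} } = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix}

Hence,

\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix} \begin{bmatrix} 2 & 1/2 \\ 1/2 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} 7/15 & 2/15 \\ 2/15 & 7/15 \end{bmatrix}

The solution for S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} is thus

u_1 = (2 . 7/15 + 2/15) y_1 = 16/15 y_1

u_2 = (2 . 2/15 + 7/15) y_2 = 11/15 y_2,

as was established before.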
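
The matrix computation behind (5.13) is small enough to verify exactly. The sketch below inverts the 2x2 matrix of observation moments and recombines the rows as in the last step of the example; the helper names `inv2` and `mul2` are hypothetical, and the two moment matrices are entered by hand.

```python
from fractions import Fraction as F

def inv2(M):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def mul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

h = F(1, 2)
lhs = [[F(2), h], [h, F(2)]]     # E{diag(y_1, y_2) Q diag(y_1, y_2)}
rhs = [[F(1), h], [h, F(1)]]     # E{diag(x, x) Q diag(y_1, y_2)}
coeffs = mul2(rhs, inv2(lhs))    # rows: (a_1, b_1) and (a_2, b_2)

# S x = (2x, x) when x_1 = x_2 = x, so weight the first row twice and the second once.
u1_gain = 2 * coeffs[0][0] + coeffs[1][0]
u2_gain = 2 * coeffs[0][1] + coeffs[1][1]
```

The coefficient matrix comes out as [[7/15, 2/15], [2/15, 7/15]], and the recombined gains are 16/15 and 11/15.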

Notice  that  a  diagonal  Q  would  have  decoupled  the  problem  by  permitting 
a  trivial  inversion  of  a  diagonal  matrix,  as  predicted  in  proposition  4.3. 

6.  CONCLUSION

We have presented a framework for the design of distributed estimation
schemes with specific architectures, based on a decision theoretic approach.
For a fully decentralized architecture, explicit solutions to the
estimation problem were described and illustrated with several examples. The
examples illustrate that the complexity of the decentralized estimation
scheme is critically dependent on the importance of the cross-correlation
of errors in the local estimators, which is represented by the off-diagonal
elements of the positive definite matrix Q. Most practical systems will want
to weigh heavily the correlation of local errors. For example, in a
distributed surveillance network, it is important that errors in location or
detection at one local substation be corrected by other substations. In
other words, it is very costly for all substations to err in the same way.
This is reflected in the performance measure by the off-diagonal elements of
Q.

The examples in Section 5 illustrate the high dimensionality required
by the local estimators in order to compensate for correlations in their
errors. It is our conjecture that the dimensionality of the local estimators
is directly related to the number of off-diagonal elements of Q.

When there is a coordinator station present, the results presented in
Section 3 provide necessary conditions for the optimality of the estimation
operators. Unfortunately, the coupling between decisions at the local
substations and the information available to the coordinator makes the analysis
a difficult problem. We expect that, under some simplifying assumptions,
the necessary conditions of Section 3 can lead to a solution, as in Section 4.
Such results have been reported in Willsky, Castanon et al [2] for a simple
class of performance measures.

The formulation of Section 2 can be extended to incorporate
communication restrictions, as well as delays in the transmission of local decisions.
These are areas which will be studied in the future.